Mobile robot path planning with reformative bat algorithm

Mobile robot path planning has attracted much attention as a key technology in robotics research. In this paper, a reformative bat algorithm (RBA) for mobile robot path planning is proposed and employed as the control mechanism of the robots. The Doppler effect is applied to the frequency update to improve RBA. When the robot is in motion, the Doppler effect can be adaptively compensated to prevent the robot from converging prematurely. In the velocity update and position update, a chaotic map and a dynamic disturbance coefficient are introduced, respectively, to enrich the population diversity and weaken the limitation of local optima. Furthermore, Q-learning is incorporated into RBA to choose the loudness attenuation coefficient and the pulse emission enhancement coefficient, which reconciles the trade-off between exploration and exploitation while improving the local search capability of RBA. Simulation experiments are carried out in two different environments, where the success rate of RBA is 93.33% and 90%, respectively. Moreover, in terms of success rate, path length and number of iterations, RBA is more robust and plans the optimal path in a relatively short time compared with other algorithms in this field, which illustrates its validity and reliability. Finally, with the aid of the Robot Operating System (ROS), real-world robot navigation experiments indicate that RBA has satisfactory real-time performance and path planning effect, so it can be considered a strong choice for dealing with path planning problems.


Introduction
As a representative of high-end intelligent equipment, mobile robot technology is advancing rapidly and has been widely applied in family services, rescue and relief, warehousing and logistics, and other practical fields. To achieve the shortest collision-free movement of a mobile robot from the starting point to the target point, mobile robot path planning has become a focus of current research and has attracted close attention from scholars. To date, a variety of effective methods have been developed to deal with path planning problems, such as visibility graph [1,2], artificial potential field (APF) [3,4], rapidly-exploring random tree (RRT) [5], reinforcement learning [...] above do not take this factor into account. Hence, there is still plenty of room for improvement in their performance.
In order to further improve BA and better complete the path planning task in static environments, this paper puts forward a reformative BA, named RBA, in which all robots are regarded as bats, and one robot represents one bat. Moreover, RBA is employed as the robots' control mechanism to realize the robots' search for the target, thereby accomplishing the path planning task. The main contributions of RBA are highlighted in the following aspects: (1) The Doppler effect is applied to the frequency update to improve RBA. When the robot is in motion, the Doppler effect can be adaptively compensated to prevent the robot from converging prematurely. (2) In the velocity update and position update, a chaotic map and a dynamic disturbance coefficient are introduced, respectively, to enrich the population diversity and weaken the limitation of local optima. (3) Q-learning is adopted to choose the loudness attenuation coefficient and the pulse emission enhancement coefficient, which coordinates the trade-off between exploration and exploitation while improving the local search capability of RBA. To verify the validity and reliability of RBA, simulation experiments are carried out in two different environments. To begin with, RBA is compared with five classical swarm intelligence optimization algorithms, namely PSO, BA, FA, TLBO and WOA. The experimental results demonstrate that RBA has good comprehensive performance and can effectively and reliably implement optimal path planning. Subsequently, RBA is compared with four PSO variants, namely BPSO [33], PSO-DE [34], CMOPSO [35] and SLPSO [36]. The results show that, compared with the PSO variants, RBA has superior search performance and stronger robustness. Finally, the proposed RBA is compared with three other state-of-the-art BA variants, i.e. EBat [28], PTRBA [37] and ARBA [38]. The results indicate that RBA balances optimization effect and computational efficiency, and has excellent robustness. With the help of ROS, real-world robot navigation experiments are also carried out. The related results reveal that RBA has satisfactory real-time performance and path planning effect, and can be considered a strong choice for dealing with path planning problems.
The remainder of this paper is organized as follows. In 'Bat algorithm' and 'Q-learning', we review the knowledge of BA and Q-learning, respectively. The proposed RBA is described in detail in 'Reformative bat algorithm (RBA)'. To evaluate the proposed approach, simulation experiments are conducted in 'Simulation testing' and real-world robot navigation experiments are finished in 'Real-world case'. In the end, conclusions are drawn and future work is provided in 'Conclusions and future work'.

Bat algorithm
BA was first introduced in 2010, inspired by bats' echolocation behavior in search of prey. In nature, bats emit ultrasonic pulses and analyze reflected ultrasonic waves to determine the information of prey. Besides, bats can search for prey by changing their ultrasonic frequency, velocity and position. In the process of approaching prey, bats will increase the emissivity of ultrasonic pulses and weaken the loudness. The implementation of BA is based on the following assumptions. (1) All bats use echolocation to sense distance, and they can accurately distinguish between prey and obstacles. (2) Bats can automatically adjust the frequency and emissivity of the pulses according to the proximity of the target. (3) It is assumed that the loudness changes from a maximum value to a fixed minimum value.
The frequency, velocity and position values of each bat can be calculated as
\[ f_i = f_{\min} + (f_{\max} - f_{\min})\,\beta \tag{1} \]
\[ v_i^{t} = v_i^{t-1} + (x_i^{t} - x_{*})\,f_i \tag{2} \]
\[ x_i^{t} = x_i^{t-1} + v_i^{t} \tag{3} \]
where $f_{\max}$ and $f_{\min}$ are the maximum and minimum values of the search pulse frequency, respectively; β ∈ [0, 1] is a uniformly distributed random number; $x_{*}$ indicates the optimal position of all current bats. For the local search stage, a new solution is generated in accordance with the following:
\[ x_{new} = x_{old} + \varepsilon A^{t} \tag{4} \]
where $x_{old}$ is the current best solution, $x_{new}$ is the new solution generated after the local search; ε ∈ [−1, 1] is a random number; $A^{t}$ is the average loudness of all bats at iteration t. The iterative equations for loudness $A_i$ and pulse emissivity $r_i$ are expressed as follows:
\[ A_i^{t+1} = \alpha A_i^{t} \tag{5} \]
\[ r_i^{t+1} = r_i^{0}\,[1 - \exp(-\gamma t)] \tag{6} \]
where α and γ are constants; $r_i^{0}$ is the initial pulse emissivity. For any 0 < α < 1 and γ > 0, we have $A_i^{t} \to 0$ and $r_i^{t} \to r_i^{0}$ as $t \to \infty$.
The pseudo code of BA is listed in Algorithm 1. As can be seen from Algorithm 1, the pulse emissivity $r_i$ controls whether BA performs local search, and the loudness $A_i$ determines the local search performance of BA. Furthermore, according to Eqs (5) and (6), the loudness attenuation coefficient α and the pulse emission enhancement coefficient γ play a vital role in the iterative process of loudness and pulse emissivity, respectively. Therefore, in order to coordinate the balance between exploration and exploitation and improve the local search capability of BA, it is necessary to choose the loudness attenuation coefficient and the pulse emission enhancement coefficient carefully. In this paper, Q-learning is employed to tackle this issue. The details will be given in 'Parameters preselection'.
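As a reading aid, the standard BA loop described above can be sketched in Python. This is a minimal illustration rather than the authors' code: the function name `bat_algorithm`, the default parameter values, the bound clipping and the acceptance test are assumptions, and the sketch minimises an arbitrary objective instead of the path planning fitness. The comments state the standard update each line is meant to mirror.

```python
import math
import random


def bat_algorithm(objective, dim, bounds, n_bats=20, t_max=100,
                  f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9, seed=0):
    """Minimise `objective` with the standard BA updates (illustrative sketch)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(value):
        return min(max(value, lo), hi)

    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_bats)]
    v = [[0.0] * dim for _ in range(n_bats)]
    loudness = [1.0] * n_bats                             # A_i starts at its maximum
    r0 = [rng.uniform(0.0, 1.0) for _ in range(n_bats)]   # initial emissivity r_i^0
    rate = [0.0] * n_bats                                 # r_i grows towards r_i^0
    cost = [objective(p) for p in x]
    best_idx = min(range(n_bats), key=cost.__getitem__)
    best, best_cost = x[best_idx][:], cost[best_idx]

    for t in range(1, t_max + 1):
        mean_loudness = sum(loudness) / n_bats
        for i in range(n_bats):
            # Frequency: f_i = f_min + (f_max - f_min) * beta
            f_i = f_min + (f_max - f_min) * rng.random()
            # Velocity: v_i += (x_i - x_best) * f_i
            v[i] = [v[i][d] + (x[i][d] - best[d]) * f_i for d in range(dim)]
            # Position: x_i + v_i, clipped to the search bounds (assumption)
            cand = [clip(x[i][d] + v[i][d]) for d in range(dim)]
            if rng.random() > rate[i]:
                # Local search: x_new = x_best + eps * mean loudness
                cand = [clip(best[d] + rng.uniform(-1.0, 1.0) * mean_loudness)
                        for d in range(dim)]
            cand_cost = objective(cand)
            if cand_cost <= cost[i] and rng.random() < loudness[i]:
                x[i], cost[i] = cand, cand_cost
                loudness[i] *= alpha                             # A_i <- alpha * A_i
                rate[i] = r0[i] * (1.0 - math.exp(-gamma * t))   # r_i update
            if cand_cost < best_cost:
                best, best_cost = cand[:], cand_cost
    return best, best_cost
```

Calling `bat_algorithm(lambda p: sum(c * c for c in p), dim=2, bounds=(-5.0, 5.0))` returns the best point found and its cost. The fixed `alpha` and `gamma` arguments are exactly the two coefficients that the paper later selects with Q-learning.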

Q-learning
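Q-learning is a trial-and-error learning method whose purpose is to learn an optimal strategy that maximizes the accumulated reward, expressed through the Q-value. The Q-value is updated as follows:
\[ Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \mu \left[ re(s_t, a_t) + \eta \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right] \tag{7} \]
where $s_t$ and $a_t$ are the state and action at step t; $re(s_t, a_t)$ is an immediate reward; η is a discount factor; μ is the learning rate, which controls the learning speed. Within a certain range of values, the larger the μ, the faster the convergence.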
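The update rule can be illustrated with a small tabular sketch in Python. The dictionary-based Q-table, the function names and the toy states and actions are assumptions made for illustration; only the roles of the learning rate μ (`mu`) and the discount factor η (`eta`) come from the text.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, mu=0.1, eta=0.9):
    """Apply one tabular Q-learning update and return the new Q-value."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += mu * (reward + eta * best_next - q[(state, action)])
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Pick the action with the largest Q-value in the given state."""
    return max(actions, key=lambda a: q[(state, a)])


# Toy example with hypothetical states "s0", "s1" and two actions.
q_table = defaultdict(float)
actions = ["left", "right"]
q_update(q_table, "s0", "right", reward=1.0, next_state="s1", actions=actions)
```

With an all-zero table, a reward of 1.0 and `mu=0.1`, the updated entry becomes 0.1, so `greedy_action(q_table, "s0", actions)` then prefers `"right"`.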
In this paper, the greedy strategy is chosen as the action selection strategy. The greedy strategy, as the name implies, selects the action that maximizes the Q-value. The relevant equation is expressed as
\[ a_t = \arg\max_{a} Q(s_t, a) \tag{8} \]

Reformative bat algorithm (RBA)
As mentioned in 'Introduction', BA has both advantages and challenges. Thus, in this section, RBA is proposed to address the corresponding challenges and significantly improve BA. On the one hand, the Doppler effect, chaotic map and dynamic disturbance coefficient help RBA avoid premature convergence and weaken the limitation of local optima. On the other hand, by means of Q-learning, RBA can resolve the difficulties of BA caused by poor coordination between the loudness attenuation coefficient and the pulse emission enhancement coefficient.
Algorithm 1: Pseudo code of BA.

Doppler effect
According to Eq (1), the frequency update of BA is strongly random, so the planned path may not be smooth enough and premature convergence may occur. Consequently, the Doppler effect is introduced to improve the frequency update of BA. The improved frequency calculation is given by Eqs (9) and (10), which build on the Doppler relation
\[ \xi_i = \frac{v \pm v_i^{t}}{v \mp v_s}\,\xi_0 \]
where $\xi_i$ is the observation frequency; $\xi_0$ is the original emission frequency of the emission source (target); $v$ is the velocity of wave propagation; $v_i^{t}$ is the movement velocity of the observer (robot), whose sign is "+" if the observer is approaching the emission source and "-" otherwise; $v_s$ is the movement velocity of the emission source, whose sign is "-" if the emission source is approaching the observer and "+" otherwise.
In the light of Eq (10), we can discover that in the Doppler effect, the frequency will change as the distance between the robot and the target changes. Hence, the robot can adaptively compensate for the Doppler effect during the movement, and then regulate the velocity by adaptively adjusting the frequency, thereby avoiding premature convergence.
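The sign convention of the Doppler relation can be made concrete with a short Python helper. This sketch shows only the classical Doppler shift under the stated sign rules; how the shifted frequency enters the bat frequency update of RBA is not reproduced here, and the function name and arguments are assumptions.

```python
def doppler_frequency(xi_0, v_wave, v_observer, v_source,
                      observer_approaching=True, source_approaching=True):
    """Observed frequency for a moving observer (robot) and source (target)."""
    # "+" in the numerator when the observer moves towards the source.
    numerator = v_wave + v_observer if observer_approaching else v_wave - v_observer
    # "-" in the denominator when the source moves towards the observer.
    denominator = v_wave - v_source if source_approaching else v_wave + v_source
    if denominator <= 0:
        raise ValueError("source speed must stay below the wave speed")
    return xi_0 * numerator / denominator
```

For a static target (`v_source=0.0`), a robot approaching at one tenth of the wave speed observes a frequency about 10% higher than the emitted one, and a receding robot observes one about 10% lower, which is the distance-dependent change the text refers to.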

Improved model for velocity and position
In RBA, the velocity and position values are updated by Eqs (11)-(14). The standard BA uses Eq (3) to update the position, in which the calculation of $v_i^{t}$ is inseparable from $x_i^{t} - x_{*}$. Hence, when conducting the global search, BA is directly constrained by $x_i^{t} - x_{*}$, and it easily falls into local optima. In response to this problem, the attenuation coefficient σ is introduced in Eq (12). Since a chaotic map has the merits of ergodicity, non-repeatability and sensitivity, we select a chaotic map to update σ, where z ∈ (0, 1) is a constant and t represents the current iteration number. Based on Eq (12), the value of σ always lies in (0, 1). Therefore, the limitation of local optima is reduced. In addition, the dynamic disturbance coefficient ω is put forward as shown in Eq (14), where τ is the disturbance deviation factor and betarnd() is a random number obeying the beta distribution. The dynamic disturbance coefficient ω decreases adaptively as the number of iterations increases. Consequently, in the early stage, ω strongly disturbs the position update, which helps expand the search scope of the bats. In the later stage, ω disturbs the position update less, which benefits the stability of the algorithm. Through many experiments, the constant z and the disturbance deviation factor τ are set to 0.5 and 0.1, respectively.
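The two mechanisms can be illustrated with a hedged Python sketch. The exact forms of Eqs (11)-(14) are not reproduced here, so everything below is an assumption chosen only to show the behaviour described: the logistic map stands in for the chaotic map, the disturbance decays linearly with the iteration count, σ scales the attraction term, and ω scales the velocity in the position update.

```python
import random


def logistic_map(sigma, growth=4.0):
    """One step of the logistic map, a common chaotic map on (0, 1)."""
    return growth * sigma * (1.0 - sigma)


def disturbance(t, t_max, tau=0.1, rng=random):
    """Beta-distributed disturbance whose scale shrinks as t grows (assumed form)."""
    return tau * (1.0 - t / t_max) * rng.betavariate(2.0, 2.0)


def update_bat(x, v, best, f_i, sigma, omega):
    """Velocity update attenuated by sigma, then position update perturbed by omega."""
    v_new = [vd + sigma * (xd - bd) * f_i for vd, xd, bd in zip(v, x, best)]
    x_new = [xd + (1.0 + omega) * vd for xd, vd in zip(x, v_new)]
    return x_new, v_new
```

Because `sigma` stays between 0 and 1, the pull exerted by the distance to the best position is damped by a different amount at every iteration, while `disturbance` returns larger values early in the run and exactly zero at `t == t_max`.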

Parameters preselection
In BA, the quality of the optimization results is determined by the loudness attenuation coefficient α and the pulse emission enhancement coefficient γ. If these parameters are not properly coordinated, the convergence speed of BA is affected, making it difficult to ensure the path planning effect. Therefore, in the local search phase, Q-learning is applied to preselect the optimal combinations of these parameters to improve the optimization effect of BA. The relevant idea is displayed in Fig 1. In Fig 1, the ⟨α, γ⟩ set is composed of the loudness attenuation coefficient α and the pulse emission enhancement coefficient γ, and each ⟨α, γ⟩ combination corresponds to an action in Q-learning. $X_i(t)$ is defined as the position of the ith bat at iteration t. Moreover, $R_i(t)$ is the fitness function value of the bat at position $X_i(t)$, which is defined as the state of Q-learning. The combination of BA and Q-learning can be described as selecting the optimal combination ⟨α′, γ′⟩ from the ⟨α, γ⟩ set according to Eq (8) when the state is $R_i(t)$. In BA, the optimal combination ⟨α′, γ′⟩ is utilized to obtain the next position $X_i(t+1)$ of the bat, and then the Q-value of the next state $R_i(t+1)$ is estimated. On the other hand, when the optimal action ⟨α′, γ′⟩ acts on the environment, the corresponding immediate reward re($R_i(t)$, ⟨α′, γ′⟩) is generated. The immediate reward is set to the difference between the fitness function values of the bat in successive iterations, as expressed in Eq (15). Owing to the application of Q-learning, in the local search phase each bat position has its corresponding optimal ⟨α′, γ′⟩ combination, and all of this information is saved in the Q-table. In the implementation stage, RBA can directly select the optimal ⟨α′, γ′⟩ combinations from the Q-table, thus overcoming the defects of the standard BA caused by poorly coordinated parameters.
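The preselection idea can be sketched in Python as follows. The candidate values in `ALPHAS` and `GAMMAS`, the discretisation of the fitness value into a table index, and the sign of the reward (new fitness minus old fitness) are assumptions made for illustration; the paper specifies only that actions are ⟨α, γ⟩ combinations, that the state is the fitness value, and that the reward is the fitness difference between successive iterations.

```python
import itertools
from collections import defaultdict

ALPHAS = (0.7, 0.8, 0.9)   # hypothetical loudness attenuation candidates
GAMMAS = (0.7, 0.8, 0.9)   # hypothetical pulse emission enhancement candidates
ACTIONS = list(itertools.product(ALPHAS, GAMMAS))   # every <alpha, gamma> pair


def to_state(fitness, resolution=0.01):
    """Discretise a fitness value so that it can index the Q-table (assumption)."""
    return round(fitness / resolution)


def select_pair(q, state):
    """Greedy choice of the <alpha, gamma> pair with the largest Q-value."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def learn_step(q, fit_now, fit_next, pair, mu=0.1, eta=0.9):
    """Reward the pair by the fitness change it produced and update the table."""
    reward = fit_next - fit_now
    s, s_next = to_state(fit_now), to_state(fit_next)
    best_next = max(q[(s_next, a)] for a in ACTIONS)
    q[(s, pair)] += mu * (reward + eta * best_next - q[(s, pair)])
    return reward
```

During training, `learn_step` is called after each local search move; in the implementation stage only `select_pair` is needed, which corresponds to reading the optimal combination directly from the Q-table.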

Fitness function
In this paper, the fitness function is designed in the light of the following evaluation criteria.
(1) No collision with obstacles. (2) Achieve the shortest path length. The corresponding fitness function fit is expressed as Eq (16), where L is the path length of the mobile robot from the starting point to the target point, which conforms to Eq (17); p is the penalty term used to exclude paths that collide with obstacles, and its value is set to 100; λ is the flag variable with an initial value of 0. The update process of λ, Eq (18), loops over the obstacles as follows:
for k = 1 : nobs
    $d_k = \sqrt{(xx - xobs_k)^2 + (yy - yobs_k)^2}$; [...]
end
Given that the robot has a certain volume, the obstacles are expanded to prevent the robot from hitting them. nobs is the total number of obstacles. $(xobs_k, yobs_k)$ and $robs_k$ are the center coordinate and maximum influence radius of the kth expanded obstacle, respectively. $d_k$ is the distance from a point on the path to the center coordinate of the obstacle. If there is no collision between the robot and the obstacle, then λ = 0. However, if the robot collides with the obstacle, λ is a positive number. Hence, when the fitness function fit reaches its maximum value, the shortest collision-free path is obtained.
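A minimal Python sketch of such a fitness evaluation is given below. The reciprocal form `1 / (L + penalty * lam)` and the way the flag grows with the depth of penetration are assumptions chosen so that larger fitness means a shorter, collision-free path; they are not the paper's Eqs (16) and (18). Obstacles are given as hypothetical `(x, y, radius)` tuples, and only the waypoints are checked, not the segments between them.

```python
import math


def path_length(points):
    """Sum of Euclidean distances between consecutive waypoints."""
    return sum(math.hypot(qx - px, qy - py)
               for (px, py), (qx, qy) in zip(points, points[1:]))


def collision_flag(points, obstacles):
    """Positive when any waypoint lies inside an expanded obstacle, else 0."""
    lam = 0.0
    for cx, cy, radius in obstacles:
        for px, py in points:
            d_k = math.hypot(px - cx, py - cy)   # distance to obstacle centre
            if d_k < radius:
                lam += 1.0 - d_k / radius        # deeper penetration, larger flag
    return lam


def fitness(points, obstacles, penalty=100.0):
    """Larger is better: short paths without collisions score highest."""
    return 1.0 / (path_length(points) + penalty * collision_flag(points, obstacles))
```

With the penalty set to 100, a single waypoint at the centre of an obstacle adds as much to the denominator as 100 units of path length, so colliding paths are effectively excluded from the search.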

Implementation of RBA
After model improvement and parameters preselection, RBA will be implemented into path planning. In the global search stage, the Doppler effect, attenuation coefficient and dynamic disturbance coefficient are added to the RBA. Consequently, unlike standard BA, the frequency, velocity and position values in RBA are updated according to Eqs (9)- (14). In the local search stage, RBA can directly select the corresponding optimal <α 0 , γ 0 > combination from the Q-table on the basis of the current position of the bat, which can significantly improve the optimization performance of the algorithm.
The pseudo code of RBA is given in Algorithm 2.
Algorithm 2: Pseudo code of RBA.
Define pulse frequency $f_i$ at $x_i$;
Initialize values for pulse emissivity $r_i$ and loudness $A_i$;
while t ≤ $T_{\max}$ do
    Adjust frequency by Eqs (9) and (10);
    Update velocities by Eqs (11) and (12);
    Update positions by Eqs (13) and (14);
    if rand > $r_i$ then
        Select a best position;
        Select the optimal combination ⟨α′, γ′⟩ from the Q- [...]
[...] (10), (12) and (14), it is apparent that only simple numerical operations are involved in $M_1$ and $M_2$. Hence, the time computational complexity of the proposed RBA is only slightly increased compared to that of BA.

Experimental setup
In

Test case 1
The map used in test case 1 contains nine obstacles. The shortest collision-free path on this map is shown in Fig 5(a), where (0, 0) is the starting point, (8, 10) is the target point, and the optimal path length is approximately 13.1716.
Comparison with classical algorithms. Five classical optimization algorithms are compared with our approach to demonstrate the superiority of the proposed RBA. In order to objectively analyze the performance of the algorithms and avoid contingency, each algorithm runs 30 times on the map. After 180 experiments, the experimental results are shown in Fig 2. In the 30 experiments of each algorithm, RBA realizes the optimal path 28 times, PSO realizes the optimal path 17 times, BA realizes the optimal path 26 times, FA realizes the optimal path 16 times, TLBO realizes the optimal path 30 times, and WOA realizes the optimal path 20 times. Consequently, the success rates of the above algorithms are 93.33%, 56.67%, 86.67%, 53.33%, 100%, and 66.67%, respectively.
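The reported success rates follow directly from the number of optimal runs out of 30, as the short Python check below shows. The counts are taken from the text; the variable names are illustrative.

```python
# Number of runs (out of 30) in which each algorithm found the optimal path.
optimal_runs = {"RBA": 28, "PSO": 17, "BA": 26, "FA": 16, "TLBO": 30, "WOA": 20}
RUNS = 30

# Success rate in percent, rounded to two decimals as in the text.
success_rate = {name: round(100.0 * hits / RUNS, 2)
                for name, hits in optimal_runs.items()}
```

For example, `success_rate["RBA"]` evaluates to 93.33, matching the value quoted for RBA.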
For an in-depth understanding of the distribution of the experimental data in Fig 2, the mean and standard deviation of the relevant data are listed in Table 1. According to Fig 2(a) and Table 1, the path length data curve of TLBO is very stable. This is because TLBO planned the optimal path in all 30 experiments, which also confirms that TLBO has the best ability to search for the global optimum. Among the remaining algorithms, RBA clearly performs best, and its path length data curve shows relatively small fluctuations. According to Table 1, the average path length of RBA is 13.3019, and the standard deviation of the path length is only 0.1913. In terms of the number of iterations, as can be seen from Fig 2(b) and Table 1, PSO requires the fewest iterations on average to accomplish the path planning, namely 12.43. RBA is second, with an average of 13.2 iterations. Although PSO can complete the path planning quickly, it easily falls into local optima and cannot plan the optimal path effectively. This can be verified from the optimal path planning success rate, Fig 2(a) and Table 1. In contrast to its excellent performance in path length, TLBO does not perform satisfactorily in the number of iterations. In 30 experiments, the average number of iterations of TLBO is 31.8, and the standard deviation of the number of iterations is as high as 12.7344. Furthermore, under the same experiment, the path planning results of the six algorithms are displayed in Fig 5(b). It is obvious that PSO, BA and FA are all trapped in local optima, and only RBA, TLBO and WOA plan the shortest path. To complete the optimal path planning, TLBO requires 8 iterations and WOA requires 60 iterations, while RBA requires only 5 iterations. Therefore, compared with the classical optimization algorithms, RBA can achieve optimal path planning in a relatively short time with a success rate of 93.33%, which demonstrates that RBA has the merits of rapid optimization speed and good optimization effect.
Comparison with PSO variants. In order to compare the path planning effects of RBA and PSO variants, 150 experiments are fulfilled, and the related experimental results are exhibited in Fig 3. In the 30 experiments of each algorithm, RBA realizes the optimal path 28 times, BPSO realizes the optimal path 13 times, PSO-DE realizes the optimal path 23 times, CMOPSO realizes the optimal path 22 times, and SLPSO realizes the optimal path 27 times.

PLOS ONE
Consequently, the success rates of the above algorithms are 93.33%, 43.33%, 76.67%, 73.33%, and 90%, respectively. On the basis of these success rates, it is obvious that, apart from SLPSO, the PSO variants perform relatively poorly and are prone to falling into local optima, making it difficult to achieve optimal path planning.
To analyze the experimental data in Fig 3 clearly, the mean and standard deviation of the corresponding data are shown in Table 2. It can be seen from Table 2 that the average path length of the five algorithms is roughly the same, while RBA has the smallest standard deviation of path length, which indicates that RBA has a more stable path planning effect. In terms of the number of iterations, RBA achieves the smallest mean and standard deviation, which means that RBA can accomplish optimal path planning faster than the PSO variants. Moreover, under the same experiment, the path planning results of the five algorithms are exhibited in Fig 5(c). Except for BPSO, all algorithms plan the shortest path: PSO-DE requires 9 iterations, CMOPSO requires 49 iterations and SLPSO requires 16 iterations, while RBA requires only 4 iterations.
Comparison with BA variants. To compare the performance of RBA with other novel BA variants, we collate the data of 120 experiments in Fig 4 and present the mean and standard deviation of the relevant data in Table 3. In the 30 experiments of each algorithm, RBA realizes the optimal path 28 times, ARBA realizes the optimal path 27 times, PTRBA realizes the optimal path 17 times, and EBat realizes the optimal path 28 times. Consequently, the success rates of the above algorithms are 93.33%, 90%, 56.67%, and 93.33%, respectively. Based on Fig 4 and Table 3, the path planning effect of PTRBA is relatively poor. In our opinion, this is because the tangent random exploration mechanism is applied in the local search phase of PTRBA, where it replaces ε in Eq (4). The tangent random exploration mechanism is represented as tan(π · (ξ − 0.5)), where ξ is a random number belonging to [0, 1]. When the value of ξ approaches 0 or 1, the magnitude of the tangent function approaches infinity. Therefore, in the iterative process of PTRBA, the value of the tangent function may become too large, which affects the stability of the algorithm and degrades the path planning effect. For ARBA and EBat, the optimization performance is roughly the same and better than that of PTRBA. Compared with the aforementioned BA variants, RBA has better path planning effects, not only in the path length but also in the number of iterations, thus verifying the superiority of RBA.
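The instability attributed to the tangent mechanism can be checked numerically with a few lines of Python. The function name is illustrative; the expression is the one quoted above. As a side note, tan(π(ξ − 0.5)) with ξ uniform on (0, 1) follows a standard Cauchy distribution, which has heavy tails and no finite variance.

```python
import math


def tangent_step(xi):
    """Tangent random exploration term tan(pi * (xi - 0.5)) for xi in (0, 1)."""
    return math.tan(math.pi * (xi - 0.5))


# Step sizes stay moderate near xi = 0.5 and grow without bound near 0 or 1.
samples = {xi: tangent_step(xi) for xi in (0.5, 0.75, 0.99, 0.999)}
```

At ξ = 0.75 the term is about 1, whereas at ξ = 0.999 it already exceeds 300, so a single unlucky draw can move a bat far outside the region of interest, in contrast to ε, which is bounded in [−1, 1].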
Besides, under the same experiment, the path planning results of the four algorithms are displayed in Fig 5(d). Consistent with the above analysis of its shortcomings, PTRBA converges quickly but plans a relatively long path and has a poor optimization effect. In contrast with PTRBA, the other algorithms plan the shortest path: ARBA requires 40 iterations and EBat requires 50 iterations, while RBA requires only 9 iterations.

Test case 2
In order to further demonstrate the superiority of RBA, a more complex map is used in test case 2, which contains thirteen obstacles. The shortest collision-free path on this map is drawn in Fig 9(a), where (0, 0) is the starting point, (8, 10) is the target point, and the optimal path length is approximately 13.1966.
Comparison with classical algorithms. In the comparison experiment between RBA and five classical optimization algorithms, the results of the 180 experiments are shown in Fig 6. In addition, the mean and standard deviation of the related experimental results are listed in Table 4. In the 30 experiments of each algorithm, RBA realizes the optimal path 27 times, PSO realizes the optimal path 16 times, BA realizes the optimal path 24 times, FA realizes the optimal path 16 times, TLBO realizes the optimal path 29 times, and WOA realizes the optimal path 18 times. Consequently, the success rates of the above algorithms are 90%, 53.33%, 80%, 53.33%, 96.67%, and 60%, respectively. For RBA, the performance is slightly lower than in test case 1: the probability of realizing the optimal path is reduced by 3.33 percentage points. According to Table 4, the average path length of RBA is 13.3436, and the standard deviation of the path length is 0.2469. Apart from TLBO, RBA performs better than the other algorithms. In terms of the number of iterations, PSO and FA can accomplish the path planning task faster than RBA. However, they struggle to achieve optimal path planning, and their success rates are relatively low. Moreover, under the same experiment, the path planning results of the six algorithms are depicted in Fig 9(b). Only RBA, BA, and TLBO plan the shortest path. To do so, BA requires 10 iterations and TLBO requires 27 iterations, while RBA requires only 8 iterations. Thus, compared with the classical optimization algorithms, RBA has good overall performance: it achieves a good path planning effect and has satisfactory robustness.
Comparison with PSO variants. To compare the optimization performance of RBA and PSO variants, 120 experiments are conducted and the experimental data are presented in Fig 7. In the 30 experiments of each algorithm, RBA realizes the optimal path 27 times, BPSO realizes the optimal path 12 times, PSO-DE realizes the optimal path 18 times, CMOPSO realizes the optimal path 18 times, and SLPSO realizes the optimal path 20 times. Consequently, the success rates of the above algorithms are 90%, 40%, 60%, 60%, and 66.67%, respectively. For [...] Table 5. It can be seen from Fig 7 and Table 5 that, in terms of the number of iterations, all PSO variants except CMOPSO (that is, BPSO, PSO-DE and SLPSO) can complete the path planning faster than RBA. Nevertheless, in the light of the success rate and path length results, the PSO variants, especially BPSO and CMOPSO, have unsatisfactory global optimization performance and are prone to falling into local optima. Besides, under the same experiment, the path planning results [...]
[...] Fig 8. Meanwhile, the mean and standard deviation of the relevant experimental data are listed in Table 6. In the 30 experiments of each algorithm, RBA realizes the optimal path 27 times, ARBA realizes the optimal path 25 times, PTRBA realizes the optimal path 16 times, and EBat realizes the optimal path 25 times. Consequently, the success rates of the above algorithms are 90%, 83.33%, 53.33%, and 83.33%, respectively. On the basis of Table 6, compared with the novel BA variants, RBA has a better path planning effect, not only in the path length but also in the number of iterations, which further supports the superiority of RBA. Furthermore, under the same experiment, the path planning results of the four algorithms are displayed in Fig 9(d). Only RBA and EBat plan the shortest path: EBat requires 71 iterations, while RBA requires only 12 iterations.
After the above tests, the validity and superiority of RBA have been verified. As the environment complexity increases, the path planning effect of RBA is only slightly affected, and the optimal path can still be realized in a relatively short time. The tests also demonstrate the robustness of RBA and its good adaptability in complex environments.

Real-world case
In addition to the above simulation experiments, real-world experiments are carried out to verify the real-time performance and effectiveness of our algorithm. The TurtleBot 2 mobile robot equipped with a SLAMTEC RPLIDAR A3 is adopted as the experimental platform. The motion commands of the robot are generated by an IRU-K10 minicomputer with Ubuntu 16.04 and ROS Kinetic installed. The experimental environment map, robot localization and robot path planning are implemented by the ROS packages gmapping, amcl and move_base, respectively.
The real-world experimental results are depicted in Fig 10, where the black circle indicates the robot, the green line is the global path of the robot planned by the proposed algorithm RBA, and the green arrow cluster is the particle cloud, representing the robot pose estimated by amcl. It is evident from Fig 10 that RBA can plan the global optimal path for the robot in real time. Additionally, along this path, the robot can reach the target point safely and efficiently, thus confirming the feasibility and validity of the proposed algorithm.

Conclusions and future work
In this study, a reformative BA is put forth and effectively addresses the mobile robot path planning problem, mainly relying on the following contributions. First, the Doppler effect is applied to frequency update to ameliorate RBA. When the robot is in motion, the Doppler effect can be adaptively compensated to prevent the robot from prematurely converging. Second, the chaotic map and dynamic disturbance coefficient are adopted in the velocity update and position update respectively to weaken the limitation of local optimum and expand the scope of global exploration. Third, Q-learning is integrated into RBA to make reasonable choices for the loudness attenuation coefficient and the pulse emission enhancement coefficient to optimize the algorithm performance and improve the local exploitation capability. Various simulation results verify the effectiveness and superiority of RBA. Compared with other algorithms, RBA has good comprehensive performance in path planning tasks. Furthermore, as the complexity of the environment increases, RBA exhibits superior robustness, and has the merits of fewer iterations, high success rate and high efficiency. Ultimately, real-world experimental results demonstrate that RBA can accomplish the global optimal path planning in real time, and along the optimal path, the robot can reach the target safely and efficiently.
Nevertheless, our work encompasses the following limitations. On the one hand, only the path planning problem in static scenes is considered, while the influence of dynamic obstacles is ignored. On the other hand, only the single-objective optimization problem is solved, while the multi-objective optimization situation is not comprehensively taken into account. Therefore, based on the above defects, in future work, we will comprehensively consider constraints such as path length, collision risk degree and path smoothness to address the path planning issue of mobile robots in static and dynamic environments.

Supporting information
S1 Data. Code and data. This file provides the relevant code for Test case 1 and Test case 2, as well as the experimental data for Figs 2-4 and 6-8. (ZIP)